39 research outputs found

    Selection of relevant information to improve Image Classification using Bag of Visual Words

    One of the main challenges in computer vision is image classification. Nowadays the number of images increases exponentially every day; therefore, it is important to classify them reliably. The conventional image classification pipeline usually consists of extracting local image features, encoding them as a feature vector, and classifying them using a previously trained model. With regard to feature encoding, the Bag of Words model and its extensions, such as pyramid matching and weighted schemes, have achieved good results and have become state-of-the-art methods. This process is not perfect: computers, like humans, may make mistakes in any of the steps, causing a drop in classification performance. Some of the primary sources of error in large-scale image classification are the presence of multiple objects in the image, small or very thin objects, incorrect annotations, and fine-grained recognition tasks, among others. Based on those problems and the steps of a typical image classification pipeline, the motivation of this PhD thesis was to provide guidelines for improving the quality of the extracted features in order to obtain better classification results. The contributions of the thesis demonstrate how good feature selection can improve fine-grained classification, and that a large training data set may not even be needed to learn the key features of each class and to predict with good results.

    A new species of Hexacladia Ashmead (Hymenoptera, Encyrtidae) and new record of Hexacladia smithii Ashmead as parasitoids of Dichelops furcatus (Fabricius) (Hemiptera, Pentatomidae) in Argentina

    Pentatomid adults of the species Dichelops furcatus (F.), collected on stubble of soybean, Glycine max (Linnaeus) Merrill, in Santa Fe province of Argentina, were found parasitized by two encyrtid wasp species (Hymenoptera: Encyrtidae). One of the encyrtids is described as Hexacladia dichelopsis Torréns & Fidalgo, sp. n., from both sexes, and the other species, H. smithii Ashmead, is recorded for the first time from D. furcatus in Argentina. Both species are gregarious endoparasitoids that complete their whole development (larval and pupal) in their living hosts; they emerge as imagoes by cutting their way out through the dorsal wall of the abdomen. Including the newly described H. dichelopsis, seven species of the genus are recorded from South America, and an identification key to separate them is presented. Copyright Javier Torréns et al.
    Affiliations: Torrens, Javier; Fidalgo, Alberto Antonio P.; Fernández, Celina Ana: Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Regional de Investigaciones Científicas y Transferencia Tecnológica de La Rioja (with Universidad Nacional de La Rioja; Universidad Nacional de Catamarca; Secretaría de Industria y Minería, Servicio Geológico Minero Argentino; Provincia de La Rioja), Argentina. Punschke, Eduardo: Universidad Nacional de Rosario, Argentina.

    A data augmentation strategy for improving age estimation to support CSEM detection

    The use of image-based age estimation to prevent the spread of Child Sexual Exploitation Material (CSEM) over the internet has not been investigated thoroughly by the research community. While deep learning methods are considered state of the art for general age estimation, they perform poorly in predicting the age group of minors and older adults because existing datasets contain few examples of these age groups. In this work, we present a data augmentation strategy, based on synthetic image generation and artificial facial occlusion, to improve the performance of age estimators trained on imbalanced data. We focus on modelling facial occlusion because CSEM offenders tend to cover certain parts of the victim, such as the eyes, to hide their identity. The proposed strategy is evaluated using the Soft Stagewise Regression Network (SSR-Net), a compact age estimator, and three publicly available datasets composed mainly of non-occluded images. For that reason, we also create the Synthetic Augmented with Occluded Faces (SAOF-15K) dataset to assess performance on eye- and mouth-occluded images. Results show that our strategy improves the performance of the evaluated age estimator.

    A Video Summarization Approach to Speed-up the Analysis of Child Sexual Exploitation Material

    Identifying key content in a video is essential for many security applications such as motion/action detection, person re-identification and recognition. Moreover, summarizing the key information in Child Sexual Exploitation Material, especially videos, which mainly contain distinctive scenes including people's faces, is crucial to speed up the investigations of Law Enforcement Agencies. In this paper, we present a video summarization strategy that combines perceptual hashing and face detection algorithms to keep the most relevant frames of a video containing people's faces that may correspond to victims or offenders. Due to legal constraints on access to Child Sexual Abuse datasets, we evaluated the performance of the proposed strategy on the detection of adult pornography content with the NDPI-800 dataset. We also assessed the capability of our strategy to create video summaries that preserve frames with distinctive faces from the original video, using ten additional manually labeled short videos. Results showed that our approach can detect pornography content with an accuracy of 84.15% at a speed of 8.05 ms/frame, making it appropriate for real-time applications. This work was supported by the framework agreement between the Universidad de León and INCIBE (Spanish National Cybersecurity Institute) under Addendum 01. Also, this research has been funded with support from the European Commission under the 4NSEEK project with Grant Agreement 821966. This publication reflects the views only of the authors, and the European Commission cannot be held responsible for any use which may be made of the information contained therein. Finally, we acknowledge the NVIDIA Corporation for the donation of the TITAN Xp GPU.
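The perceptual-hashing half of such a summarization strategy can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an average hash (one common perceptual hash), frames already downsampled to small grayscale grids, and a hypothetical `min_distance` threshold. The face detection step mentioned in the abstract is omitted.

```python
def average_hash(frame):
    """Perceptual hash of a small grayscale frame (list of rows of ints).

    Each bit is 1 when the pixel is brighter than the frame mean.
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming_distance(hash_a, hash_b):
    """Number of bit positions where two hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def summarize_frames(frames, min_distance):
    """Return indices of frames that differ enough from the last kept frame.

    min_distance is a hypothetical threshold, not a value from the paper.
    """
    kept = []
    last_hash = None
    for index, frame in enumerate(frames):
        current = average_hash(frame)
        if last_hash is None or hamming_distance(current, last_hash) >= min_distance:
            kept.append(index)
            last_hash = current
    return kept
```

Near-duplicate frames map to the same or a very close hash, so only frames that mark a visible change survive; a face detector could then filter the kept frames further.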

    Supervised ranking approach to identify influential websites in the darknet

    The anonymity and high security of the Tor network allow it to host a significant amount of criminal activity. Some Tor domains attract more traffic than others, as they offer better products or services to their customers. Detecting the most influential domains in Tor can help detect serious criminal activities. Therefore, in this paper, we present a novel supervised ranking framework for detecting the most influential domains. Our approach represents each domain with 40 features extracted from five sources (text, named entities, HTML markup, network topology, and visual content) and uses them to train a learning-to-rank (LtR) scheme that sorts the domains based on user-defined criteria. We experimented on a subset of 290 manually ranked drug-related websites from Tor and obtained the following results. First, among the explored LtR schemes, the listwise approach outperforms the benchmarked methods with an NDCG of 0.93 for the top-10 ranked domains. Second, we showed quantitatively that our framework surpasses link-based ranking techniques. Third, we observed that using only the user-visible text feature achieves performance comparable to using all the features, with a decrease of 0.02 at NDCG@5. The proposed framework might support law enforcement agencies in detecting the most influential domains related to possible suspicious activities. Open-access publication funded by the Consorcio de Bibliotecas Universitarias de Castilla y León (BUCLE), under Operational Programme 2014ES16RFOP009 FEDER 2014-2020 DE CASTILLA Y LEÓN, Action: 20007-CL - Apoyo Consorcio BUCL
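For readers unfamiliar with the NDCG metric reported above, the following is a minimal sketch of NDCG@k. It assumes the common formulation with linear gain and a log2 position discount; the abstract does not say which variant the authors used, and the function names are illustrative.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (linear gain, log2 discount)."""
    return sum(rel / math.log2(position + 2)
               for position, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the predicted order divided by DCG of the ideal order.

    relevances lists the true relevance of each item, in predicted rank order.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal
```

A ranking that places the most relevant domains first scores 1.0; placing a relevant domain lower is penalized more the further down it falls.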

    Detecting emerging products in TOR network based on K-Shell graph decomposition

    In this paper, we present a semi-automatic framework that identifies the most popular products, as well as some of the emerging illegal products, sold in Darknet marketplaces. Using textual information extracted from Darknet domains, we build a Product Correlation Graph (PCG), where nodes are Darknet products and edges reflect the simultaneous offering of two products. Applying the k-Shell algorithm to decompose the PCG, we identify the products contained in the core and, among them, the most popular and emerging ones. We applied our emergence detection algorithm to the Darknet Usage Text Addresses (DUTA) dataset, detecting MDMA and ecstasy as the most relevant and emerging drugs, respectively, and validated these results against reports from prestigious international drug organizations. These results make our framework a complementary tool for extracting information from illegal markets where transaction records are not available.
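The k-shell decomposition step can be sketched as follows. This is a generic peeling implementation on an undirected graph, not the authors' code; the edge-list input format and the function names are assumptions made for illustration.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def k_shell_decomposition(graph):
    """Return {node: shell index} by repeatedly peeling low-degree nodes."""
    remaining = {node: set(neigh) for node, neigh in graph.items()}
    shell = {}
    k = 0
    while remaining:
        to_remove = [n for n, neigh in remaining.items() if len(neigh) <= k]
        if not to_remove:
            k += 1  # nothing left to peel at this level, move to the next shell
            continue
        while to_remove:
            node = to_remove.pop()
            if node not in remaining:
                continue  # already peeled via another neighbour
            shell[node] = k
            for neigh in remaining.pop(node):
                remaining[neigh].discard(node)
                if len(remaining[neigh]) <= k:
                    to_remove.append(neigh)
    return shell
```

Nodes with the highest shell index form the core of the graph. In a product co-occurrence graph, those would be the products offered together with many other well-connected products.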

    Automatic classification of pores in aluminum castings using machine learning

    Porosity inspection of manufactured parts has traditionally been performed using microscopy operated by a human technician. However, the person involved needs experience in this task, and the number of parts that can be inspected per unit of time is limited. The presence of porosity in the material is critical, as it can negatively affect the mechanical properties and the quality of the part. In this paper, we propose to automate the classification of the porosity defects that appear inside parts manufactured by casting. First, we acquire images from aluminum parts manufactured by two casting methods: a traditional one using sand molding and a more modern one using the Binder Jetting (BJ) additive manufacturing technique. Then, we crop regions with and without pores, which we describe using SIFT descriptors encoded into BoVW features to train two SVM classifiers: one to predict whether the image contains a pore, and the other to indicate whether the detected pore is due to the effect of gases or to shrinkage during solidification. Ministerio de Ciencia, Innovación y Universidades; DPI2017-89840-

    Image classification with bag of visual words (Clasificación de imágenes con bag of visual words)

    Chapter 10, pp. 181-200. Image classification is the process by which a computer decides what content is present in an image, that is, which class it belongs to or which objects it contains. In recent years the Bag of Visual Words (BoVW) model has become one of the most widely used solutions for this task. The term visual word (or simply "word") refers to a small part of an image. BoVW consists of several stages: sampling keypoints from the image, describing them, building a dictionary of visual words through a clustering process, representing each image globally using this dictionary and, finally, classifying these representations to decide the class to which the image belongs. This chapter explains the BoVW image classification model, detailing these stages.
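The dictionary-building and encoding stages listed above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the chapter's implementation: descriptors are plain lists of floats (in practice they would come from a keypoint descriptor such as SIFT), clustering is a basic k-means, and all function names are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor, vocabulary):
    """Index of the visual word (centroid) closest to the descriptor."""
    return min(range(len(vocabulary)),
               key=lambda i: squared_distance(descriptor, vocabulary[i]))


def build_vocabulary(descriptors, k, iterations=20, seed=0):
    """Cluster training descriptors into k visual words with basic k-means."""
    rng = random.Random(seed)
    centroids = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_word(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def encode_image(descriptors, vocabulary):
    """Represent one image as a normalized histogram of visual-word counts."""
    histogram = [0.0] * len(vocabulary)
    for d in descriptors:
        histogram[nearest_word(d, vocabulary)] += 1.0
    total = sum(histogram)
    return [h / total for h in histogram] if total else histogram
```

The resulting histograms have a fixed length regardless of how many keypoints an image has, which is what allows a standard classifier (for example an SVM) to be trained on them in the final stage.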